Multichannel speech dereverberation based on convolutive nonnegative tensor factorization for ASR applications

Authors

Seyedmahdad Mirsamadi

John H. L. Hansen

Abstract

Room reverberation is a primary cause of failure in distant speech recognition (DSR) systems. In this study, we present a multichannel spectrum enhancement method for reverberant speech recognition, which extends a single-channel dereverberation algorithm based on convolutive nonnegative matrix factorization (NMF). The generalization to a multichannel scenario is shown to be a special case of convolutive nonnegative tensor factorization (NTF). The presented algorithm integrates information across channels in the magnitude short-time Fourier transform (STFT) domain. Doing so removes any constraints on the array geometry and any need for information about the source location, which makes the algorithm particularly suitable for distributed microphone arrays. Experiments are performed on speech data using actual room impulse responses from the AIR database. Relative word error rate (WER) improvements with a clean-trained ASR system range from +7.1% to +30.1%, depending on the number of channels and the source-to-microphone distance (1 to 3 meters).
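The idea of sharing one clean-speech spectrogram across channels in the magnitude STFT domain can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the cost function or update rules, so the Euclidean cost, the multiplicative updates, the initialization, and every name below (reconstruct, multichannel_cnmf_dereverb, P, n_iter) are assumptions made for illustration. The assumed model is Y[c, k, n] ≈ sum over p of H[c, k, p] * X[k, n - p], where Y holds the reverberant magnitudes of channel c, H is a nonnegative per-channel, per-frequency filter of P frames, and X is the clean magnitude spectrogram common to all channels.

```python
import numpy as np


def shift_right(A, p):
    """Delay A by p frames along the last axis, zero-padding the start."""
    if p == 0:
        return A
    out = np.zeros_like(A)
    out[..., p:] = A[..., :-p]
    return out


def shift_left(A, p):
    """Advance A by p frames along the last axis, zero-padding the end."""
    if p == 0:
        return A
    out = np.zeros_like(A)
    out[..., :-p] = A[..., p:]
    return out


def reconstruct(H, X):
    """Assumed model: Z[c, k, n] = sum_p H[c, k, p] * X[k, n - p]."""
    P = H.shape[2]
    return sum(H[:, :, p, None] * shift_right(X, p)[None] for p in range(P))


def multichannel_cnmf_dereverb(Y, P=8, n_iter=50, eps=1e-12):
    """Estimate a shared nonnegative X from magnitudes Y of shape (C, K, N).

    Hypothetical sketch: Euclidean cost with multiplicative updates.
    """
    C, K, N = Y.shape
    # Assumed initialization: exponentially decaying filters, channel-mean spectrum.
    H = np.broadcast_to(np.exp(-0.5 * np.arange(P)), (C, K, P)).copy()
    X = Y.mean(axis=0) + eps
    costs = []
    for _ in range(n_iter):
        # Update every filter tap of every channel with X held fixed.
        Z = reconstruct(H, X)
        for p in range(P):
            Xp = shift_right(X, p)[None]
            H[:, :, p] *= (Y * Xp).sum(axis=2) / ((Z * Xp).sum(axis=2) + eps)
        # Update the shared spectrogram, pooling evidence from all channels.
        Z = reconstruct(H, X)
        num = np.zeros_like(X)
        den = np.zeros_like(X)
        for p in range(P):
            Hp = H[:, :, p, None]
            num += (Hp * shift_left(Y, p)).sum(axis=0)
            den += (Hp * shift_left(Z, p)).sum(axis=0)
        X *= num / (den + eps)
        costs.append(float(((Y - reconstruct(H, X)) ** 2).sum()))
    return X, H, costs
```

Because X is updated from a sum over all channels while each H[c] is fitted separately, this sketch uses neither array geometry nor source position, which mirrors the property the abstract highlights. The scale ambiguity between H and X is left unresolved here; a practical system would need some normalization, which is also an assumption on our part.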


Similar resources

Mixed penalization in convolutive nonnegative matrix factorization for blind speech dereverberation

When a signal is recorded in an enclosed room, it typically gets affected by reverberation. This degradation represents a problem when dealing with audio signals, particularly in the field of speech signal processing, such as automatic speech recognition. Although there are some approaches to deal with this issue that are quite satisfactory under certain conditions, constructing a method that w...


Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation (French title: Factorisation en matrices à coefficients positifs de données multicanal convolutives pour la séparation de sources audio)

We consider inference in a general data-driven object-based model of multichannel audio data, assumed generated as a possibly underdetermined convolutive mixture of source signals. We work in the Short-Time Fourier Transform (STFT) domain, where convolution is routinely approximated as linear instantaneous mixing in each frequency band. Each source STFT is given a model inspired from nonnegativ...


Adaptive Multichannel Dereverberation for Automatic Speech Recognition

Reverberation is known to degrade the performance of automatic speech recognition (ASR) systems dramatically in farfield conditions. Adopting the weighted prediction error (WPE) approach, we formulate an online dereverberation algorithm for a multi-microphone array. The key contributions of this paper are: (a) we demonstrate that dereverberation using WPE improves performance even when the acou...


Multi-step linear prediction based speech dereverberation in noisy reverberant environment

A speech signal captured by a distant microphone is generally contaminated by reverberation and background noise, which severely degrade the automatic speech recognition (ASR) performance. In this paper, we first extend a previously proposed single channel dereverberation algorithm to a multi-channel scenario. The method estimates late reflections using multichannel multi-step linear prediction...


Speech enhancement using convolutive nonnegative matrix factorization with cosparsity regularization

A novel method for speech enhancement based on Convolutive Non-negative Matrix Factorization (CNMF) is presented in this paper. The sparsity of activation matrix for speech components has already been utilized in NMF-based enhancement methods. However such methods do not usually take into account prior knowledge about occurrence relations between different speech components. By introducing the ...




Publication date: 2014
